Abstract

We present a method for estimating the edge of a two-dimensional bounded set, given a
finite random set of points drawn from the interior. The estimator is based both on a
Parzen-Rosenblatt kernel and extreme values of point processes. We give conditions for various kinds
of convergence and asymptotic normality. We propose a method of reducing the negative bias
and edge effects, illustrated by a simulation.

1 Introduction

We address the problem of estimating a bounded set $S$ of $\mathbb{R}^2$ given a finite random set $N$ of
points drawn from the interior. This kind of problem arises in various frameworks such as
classification [16], image processing [20] or econometrics [5]. Many different solutions have been
proposed since [7] and [33], depending on the properties of the observed random set $N$ and of the
unknown set $S$. In this paper, we focus on the special case where
$S = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1;\ 0 \le y \le f(x)\}$, with $f$ an unknown function. Thus, the
estimation of the subset $S$ reduces to the estimation of the function $f$. This problem arises for
instance in econometrics, where the function $f$ is called the production frontier. This is the case
in [15], where the data consist of pairs $(X_i, Y_i)$, $X_i$ representing the input (labor, energy or
capital) used to produce an output $Y_i$ in a given firm $i$. In such a framework, the value $f(x)$ can
be interpreted as the maximum level of output which is attainable for the level of input $x$.

Most papers on support estimation consider the random set of points $N$ appearing under
the frontier $f$ as an $n$-sample. However, in practice, the number as well as the position of the
points is random, so we have long preferred to deal with point processes. Cox processes are
known to provide a high level of generality among the point processes on a plane. However, after
conditioning on the intensity, the realization of a Cox process is merely that of a Poisson point
process, so what is really observed is a Poisson point process. Moreover, in most applications such as
medical imaging, $f$ delimits a frontier between two zones. A contrasting substance is spread on the
whole domain, for instance the brain. The magnetic resonance imaging only displays the bleeding,
so the healthy part acts as a mask. Conversely, but similarly, when investigating the retina, the
patient does not detect the small luminous spots pointed on a destroyed area. In such cases, there
is no way to consider the remaining observed points as a random sample. In fact, such truncated
empirical point processes are no longer random samples but binomial point processes (see [22]).
In fact, even nature is unable to obtain a random sample on $S$ in this way! It turns out that
binomial point processes are well approximated by Poisson processes. Moreover, truncated Poisson
point processes are still Poisson point processes and the same is true for general Cox processes.
Naturally, as our point of view is not the prevailing one, we have to preserve the possibility of
comparing our results with those of authors dealing with random samples. So in place of a uniform
$n$-empirical process on $S$ with the distribution $\lambda/\lambda(S)$, we consider a Poisson point process with
the intensity $n\lambda/\lambda(S)$, where $\lambda$ denotes the Lebesgue measure. The intensities of the two
processes are obviously equal. Finally, we claim that similar results can be deduced for samples by
means of Poisson approximations. But this is not so simple to achieve, and we prefer to defer this
work to a further paper.
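
To make the sampling scheme concrete, the following Python sketch simulates a Poisson point process with mean measure $n\lambda/\lambda(S)$ on $S$ by thinning a homogeneous process on a bounding rectangle. It is only an illustration: the function name, the example frontier and the use of NumPy are assumptions that do not come from the paper, and the area $\lambda(S)$ is assumed to be known.

```python
import numpy as np

def simulate_poisson_under_frontier(f, n, f_max, area, rng):
    """Draw a Poisson point process with intensity n / area on
    S = {(x, y): 0 <= x <= 1, 0 <= y <= f(x)}.

    f     : vectorized frontier on [0, 1], assumed to satisfy 0 <= f <= f_max
    n     : expected number of points falling in S
    f_max : an upper bound of f on [0, 1]
    area  : Lebesgue measure of S, i.e. the integral of f over [0, 1]
    """
    # Homogeneous Poisson process on the rectangle [0, 1] x [0, f_max]
    # with the same intensity n / area.
    total = rng.poisson(n * f_max / area)
    x = rng.uniform(0.0, 1.0, size=total)
    y = rng.uniform(0.0, f_max, size=total)
    # The restriction of a Poisson process to S is a Poisson process on S.
    keep = y <= f(x)
    return x[keep], y[keep]

# Hypothetical frontier whose integral over [0, 1] equals 1, so area = 1.
def example_frontier(x):
    return 1.0 + 0.5 * np.sin(2.0 * np.pi * x)

rng = np.random.default_rng(0)
xs, ys = simulate_poisson_under_frontier(
    example_frontier, n=500, f_max=1.5, area=1.0, rng=rng
)
```

The number of returned points is itself random, with expectation $n$, which is the feature that distinguishes this setting from an $n$-sample.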

In the wide range of nonparametric functional estimators [2], piecewise polynomials have been
especially studied [THl HO] and their asymptotic optimality is established under different regularity
assumptions on $f$. See [TH [T71 for other cases. Estimators of $f$ based upon orthogonal series
appear in [1] [T5]. In the case of Haar and $C^1$ bases, extreme values estimates are defined and
studied in [9l [THl H] and reveal better properties than those of [18]. In the same spirit, a
Faber-Schauder estimate is proposed in [6]. Estimating $f$ can also be considered as a regression
problem $Y_i = f(X_i) + \varepsilon_i$ with negative noise $\varepsilon_i$. In this context, local polynomial estimates are
introduced, see [13], or [12] for a similar approach.

Here a kernel method is proposed in order to obtain smooth estimates $\hat f_n$. From the practical
point of view, these estimates enjoy explicit forms and are thus easily implementable. From the
theoretical point of view, we give limit laws with explicit speed of convergence $\sigma_n$ for
$\sigma_n^{-1}(\hat f_n - \mathbb{E}\hat f_n)$ and even for $\sigma_n^{-1}(\hat f_n - f)$ after reducing the bias. The rate of
convergence of the $L_1$ norm is proved to be $O(n^{-\alpha/(5/4+\alpha)})$ for an $\alpha$-Lipschitzian frontier $f$,
which is slightly suboptimal compared to the minimax rate $n^{-\alpha/(1+\alpha)}$. Section 2 is devoted to
the definition of the estimator and basic properties of extreme values. Section 3 deals with an
ad hoc adaptation of Bochner approximation results. In Section 4 we give the main convergence
results: mean square uniform convergence and almost complete uniform convergence. We prove, in
Section 5, the asymptotic normality of the estimator, when centered at its mathematical
expectation. Section 6 is devoted to some bias reductions, allowing in certain cases asymptotic
normality for an estimator centered at the function $f$. We also present a technique for avoiding
edge effects. In |TT], a simulation gives an idea of the improvements brought by these
modifications. Section 7 is dedicated to a comparison of kernel estimates with the other proposals
found in the literature.

2 Definition and basic properties

For all $n > 0$, let $N$ be a Poisson point process with mean measure $nc\lambda$, where $\lambda$ denotes the
Lebesgue measure on a subset $S$ of $\mathbb{R}^2$ defined as follows:

$$S = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1;\ 0 \le y \le f(x)\}. \tag{1}$$

The normalization parameter $c$ is defined by $c = 1/\lambda(S)$ such that $\mathbb{E}(N(S)) = n$. We assume that
on $[0, 1]$, $f$ is a bounded measurable function, strictly positive and $\alpha$-Lipschitz, $0 < \alpha \le 1$, with
Lipschitz multiplicative constant $L_f$, and that $f$ vanishes elsewhere. We denote by $m$ (and $M$) the
lower (and the upper) bound of $f$ on $[0, 1]$. Given $(h_n)$ a sequence of positive real numbers such
that $h_n \to 0$ when $n \to \infty$, the function $f$ is approximated by the convolution:

$$g_n(x) = \int_{\mathbb{R}} K_n(x - y) f(y)\,dy, \quad x \in [0, 1], \tag{2}$$

where $K_n$ is given by

$$K_n(t) = \frac{1}{h_n} K\left(\frac{t}{h_n}\right), \quad t \in \mathbb{R},$$

and $K$ is a bounded positive Parzen-Rosenblatt kernel, i.e. verifying:

$$\forall x \in \mathbb{R},\ 0 \le K(x) \le \sup_{\mathbb{R}} K < +\infty, \quad \int_{\mathbb{R}} K(t)\,dt = 1, \quad \lim_{|x| \to \infty} xK(x) = 0.$$

Note that $K^2$ and $K^3$ are Lebesgue-integrable. In the sequel, we introduce extra hypotheses on $K$
when necessary. Consider $(k_n)$ a sequence of integers increasing to infinity and divide $S$ into $k_n$
cells $D_{n,r}$ with:

$$D_{n,r} = \{(x, y) \in S \mid x \in I_{n,r}\}, \quad I_{n,r} = \left[\frac{r-1}{k_n}, \frac{r}{k_n}\right], \quad r = 1, \dots, k_n.$$

The convolution (2) is discretized on the $\{I_{n,r}\}$ subdivision of $[0, 1]$:

$$f_n(x) = \frac{1}{k_n} \sum_{r=1}^{k_n} K_n(x - x_r) f(x_r), \quad x \in [0, 1],$$

where $x_r$ is the center of $I_{n,r}$. The values $f(x_r)$ of the function on the subdivision are estimated
through $X^*_{n,r}$, the supremum of the second coordinate of the points of the truncated process
$N(\cdot \cap D_{n,r})$. The considered estimator can be written as:

$$\hat f_n(x) = \frac{1}{k_n} \sum_{r=1}^{k_n} K_n(x - x_r) X^*_{n,r}, \quad x \in [0, 1]. \tag{3}$$
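
The construction of the estimator (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the biweight kernel is one possible compactly supported choice, an empty cell is given the maximum 0 by convention, and all function names are hypothetical.

```python
import numpy as np

def biweight_kernel(t):
    """Compactly supported kernel (15/16)(1 - t^2)^2 on [-1, 1]; it integrates to 1."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 15.0 / 16.0 * (1.0 - t**2) ** 2, 0.0)

def cell_maxima(xs, ys, k):
    """Largest second coordinate in each of the k cells of width 1/k.

    Assumed convention: a cell without any point gets the value 0.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    cells = np.minimum((xs * k).astype(int), k - 1)  # cell index of each point
    maxima = np.zeros(k)
    np.maximum.at(maxima, cells, ys)                 # running maximum per cell
    return maxima

def kernel_frontier_estimate(x, xs, ys, k, h, kernel=biweight_kernel):
    """Evaluate (1/k) * sum_r K_h(x - x_r) * X*_r, with K_h(t) = K(t / h) / h."""
    centers = (np.arange(k) + 0.5) / k               # cell centers x_r
    maxima = cell_maxima(xs, ys, k)                  # cell maxima X*_r
    x = np.atleast_1d(np.asarray(x, dtype=float))
    weights = kernel((x[:, None] - centers[None, :]) / h) / h
    return weights @ maxima / k
```

The estimate is a kernel-weighted average of the cell maxima, so it inherits the smoothness of the kernel.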

Formally, this estimator is very similar to the estimators based on expansion of $f$ on bases [51 [TU],
although it is obtained by a different principle. Besides, combining the uniform kernel
$K(t) = 1_{[-1/2,1/2]}(t)$ with the bandwidth $h_n = 1/k_n$ yields Geffroy's estimate:

$$\hat f_n^G(x) = \sum_{r=1}^{k_n} 1_{I_{n,r}}(x) X^*_{n,r}, \quad x \in [0, 1], \tag{4}$$

which is piecewise constant on the $\{I_{n,r}\}$ subdivision of $[0, 1]$. In contrast, here we focus on
smooth estimators obtained by considering smooth kernels in (3). More precisely, we systematically
examine the convergence properties of the estimator in two main situations:

(A) $K$ is $\beta$-Lipschitz on $\mathbb{R}$, $0 < \beta \le 1$, with Lipschitz multiplicative constant $L_K$, $x \to x^2K(x)$ is
integrable, $k_n = o(n)$, $h_n k_n^\alpha \to \infty$, and $h_n^{1+\beta} k_n^\beta \to \infty$ when $n \to \infty$.

(B) $K$ has a compact support, a bounded first derivative and is piecewise $C^2$, $k_n = o(n)$ and
$h_n k_n \to \infty$ when $n \to \infty$.

Of course, Geffroy's estimate does not fulfil these conditions. Some stochastic convergences will
require extra conditions on the $(k_n)$ sequence:

(C) $k_n = o(n/\ln n)$ and $n = o(k_n^{1+\alpha})$.
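
For sequences of power type, the rate requirements of (B) and (C) reduce to inequalities between exponents. The short Python sketch below checks them for $k_n = n^a$ and $h_n = n^{-b}$; the parametrization and the function name are assumptions made for illustration, and the kernel regularity required by (B) is not checked.

```python
def satisfies_b_and_c(a, b, alpha):
    """Rate conditions of (B) and (C) for k_n = n**a and h_n = n**(-b).

    a     : growth exponent of the number of cells k_n
    b     : decay exponent of the bandwidth h_n
    alpha : Lipschitz exponent of the frontier, 0 < alpha <= 1
    """
    h_to_zero = b > 0                    # h_n -> 0
    k_small = 0 < a < 1                  # k_n -> infinity and k_n = o(n / ln n)
    hk_diverges = a - b > 0              # h_n * k_n -> infinity
    n_small = a * (1 + alpha) > 1        # n = o(k_n^(1 + alpha))
    return h_to_zero and k_small and hk_diverges and n_small
```

For instance, with $\alpha = 1$ the pair $a = 0.8$, $b = 0.3$ is admissible, whereas $a = 0.4$ violates $n = o(k_n^{1+\alpha})$.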
Throughout this paper, we write:

$$\lambda(D_{n,r}) = \lambda_{n,r}, \quad \min_{x \in I_{n,r}} f(x) = m_{n,r}, \quad \max_{x \in I_{n,r}} f(x) = M_{n,r}.$$

The cumulative distribution function of $X^*_{n,r}$ is easily calculated on $[0, m_{n,r}]$, after noticing that,
for every measurable $B \subset S$, $P(N(B) = 0) = \exp(-nc\lambda(B))$. Indeed, for $x \le m_{n,r}$, the part of
$D_{n,r}$ lying above the level $x$ has Lebesgue measure $\lambda_{n,r} - x/k_n$, so that

$$F_{n,r}(x) = P(X^*_{n,r} \le x) = \exp\left(\frac{nc}{k_n}(x - k_n\lambda_{n,r})\right), \quad x \in [0, m_{n,r}]. \tag{5}$$

Of course, $F_{n,r}(x) = 0$ if $x < 0$ and $F_{n,r}(x) = 1$ if $x \ge M_{n,r}$. For $x \in [m_{n,r}, M_{n,r}]$, $F_{n,r}(x)$ is
unknown, but $1 - F_{n,r}(m_{n,r})$ can be controlled through regularity conditions made on $f$. Finally, (5)
and this control provide precise expansions for the first moments of $X^*_{n,r}$. We collect these useful
results in the following lemma.

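Lemma 1 Assume (C) is verified. Then,

(i) $\max_r \left| \mathbb{E}(X^*_{n,r}) - k_n\lambda_{n,r} + \frac{k_n}{nc} \right| = O\left(\frac{n}{k_n^{1+2\alpha}}\right)$,

(ii) $\max_r \left| \mathrm{Var}(X^*_{n,r}) - \frac{k_n^2}{n^2c^2} \right| = O\left(\frac{1}{k_n^{2\alpha}}\right)$,

(iii) $\max_r \mathbb{E}\left(|X^*_{n,r} - \mathbb{E}(X^*_{n,r})|^3\right) = O\left(\frac{k_n^3}{n^3}\right)$.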
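
The law (5) of a cell maximum can be checked by simulation in the simplest case of a constant frontier. The sketch below is an illustration only: it assumes $f \equiv 1$, so that $c = 1$ and $k_n\lambda_{n,r} = 1$, sets the maximum of an empty cell to 0, and uses hypothetical names.

```python
import numpy as np

def simulate_cell_maxima(n, k, reps, rng):
    """Simulate reps independent copies of the maximum height observed in one
    cell of width 1/k, for the constant frontier f = 1 (so that c = 1).

    The number of points in the cell is Poisson with mean n / k and the
    heights are uniform on [0, 1]; an empty cell yields the value 0.
    """
    counts = rng.poisson(n / k, size=reps)
    maxima = np.zeros(reps)
    for i, count in enumerate(counts):
        if count > 0:
            maxima[i] = rng.uniform(0.0, 1.0, size=count).max()
    return maxima

rng = np.random.default_rng(1)
maxima = simulate_cell_maxima(n=2000, k=50, reps=20000, rng=rng)

# Empirical counterpart of F(x) = exp((n c / k)(x - k * lambda)) at x = 0.95.
empirical_cdf = np.mean(maxima <= 0.95)
theoretical_cdf = np.exp((2000 / 50) * (0.95 - 1.0))
```

In this setting the empirical mean of the maxima sits close to $1 - k_n/n$, which is the negative bias of order $k_n/(nc)$ described by the lemma.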

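We shall also need a lemma on the Parzen-Rosenblatt kernel.

Lemma 2 Let $a \ne 0$. For any sequence of probability measures $(P_n)$, we have

$$\int_{\mathbb{R}} K(u) K\left(u + \frac{a}{h_n}\right) P_n(du) = o(h_n). \tag{6}$$

Proofs of all lemmas are postponed to the Appendix.

Corollary 1 For all $x \ne y$,

$$\frac{1}{k_n h_n} \sum_{r=1}^{k_n} K\left(\frac{x - x_r}{h_n}\right) K\left(\frac{y - x_r}{h_n}\right) = o(1).$$

This result is deduced from Lemma 2 with $a = y - x$ and $P_n = \frac{1}{k_n}\sum_{r=1}^{k_n} \delta_{(x - x_r)/h_n}$.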



3 Bias convergence

We first give conditions on the sequences $(h_n)$ and $(k_n)$ to obtain the local uniform convergence of
$f_n$ to $f$, that is, the uniform convergence on every compact subset $C$ of $]0, 1[$. Of course, since $f$
is not continuous at 0 and 1, we cannot obtain uniform convergence on the whole compact $[0, 1]$.
We note in the sequel:

$$\|g\|_C = \sup_{x \in C} |g(x)|,$$

for every function $g : [0, 1] \to \mathbb{R}$. The triangle inequality

$$\|f - f_n\|_C \le \|f_n - g_n\|_\infty + \|g_n - f\|_C$$

shows the two contributions to the bias. The first term, studied in Lemma 3, is a consequence
of the discretization of (2). The second term is studied in Lemma 4. It appears in various other
kernel estimates such as regression or density estimates.

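Lemma 3

(i) Under (A), $\|f_n - g_n\|_\infty = O\left(\frac{1}{h_n k_n^\alpha}\right) + O\left(\frac{1}{h_n^{1+\beta} k_n^\beta}\right)$.

(ii) Under (B), $\|f_n - g_n\|_\infty = O\left(\frac{1}{k_n^\alpha}\right) + O\left(\frac{1}{h_n^2 k_n^2}\right)$.

The function $f$ is uniformly continuous on $[0, 1]$ as soon as it is continuous on the same compact
interval, and the Bochner Lemma entails that $\|g_n - f\|_C \to 0$ as $n \to \infty$. The following lemma
makes this result precise by providing the rates of convergence of $\|g_n - f\|_C$ in different situations.

Lemma 4

(i) If $x \to x^2K(x)$ is integrable, then $\|g_n - f\|_C = O\left(h_n^{2\alpha/(2+\alpha)}\right)$.

(ii) If $K$ has a compact support, then $\|g_n - f\|_C = O(h_n^\alpha)$.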
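
The compact-support case can be illustrated numerically. The Python sketch below approximates the convolution $g_n$ by a trapezoidal rule and measures the largest smoothing error on a compact subset of $]0, 1[$. The biweight kernel, the grid sizes and the function names are assumptions made for illustration, and the bandwidth must be small enough for the kernel window to stay inside $[0, 1]$.

```python
import numpy as np

def biweight_kernel(t):
    """Compactly supported kernel (15/16)(1 - t^2)^2 on [-1, 1]."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 15.0 / 16.0 * (1.0 - t**2) ** 2, 0.0)

def smoothed_frontier(x, f, h, grid_size=4001):
    """Trapezoidal approximation of the integral of K_h(u) f(x - u) over [-h, h]."""
    u = np.linspace(-h, h, grid_size)
    integrand = biweight_kernel(u / h) / h * f(x - u)
    return np.sum((integrand[1:] + integrand[:-1]) * np.diff(u)) / 2.0

def sup_smoothing_error(f, h, lower, upper, num=61):
    """Largest value of |g(x) - f(x)| over a grid of the compact [lower, upper]."""
    grid = np.linspace(lower, upper, num)
    return max(abs(smoothed_frontier(x, f, h) - f(x)) for x in grid)

# Hypothetical 1-Lipschitz frontier with a kink at x = 0.5.
def kinked_frontier(x):
    return 1.0 + np.abs(x - 0.5)

errors = {h: sup_smoothing_error(kinked_frontier, h, 0.2, 0.8) for h in (0.2, 0.1, 0.05)}
```

With this kernel the support is $[-h_n, h_n]$, so for a 1-Lipschitz frontier the error cannot exceed $h_n$, in agreement with the $O(h_n^\alpha)$ rate for $\alpha = 1$.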



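Let us note that in situation (B), $1/k_n^\alpha = o(h_n^\alpha)$. Thus, as a simple consequence of Lemma 3 and
Lemma 4, we get:

Proposition 1

(i) Under (A): $\|f_n - f\|_C = O\left(\frac{1}{h_n k_n^\alpha}\right) + O\left(\frac{1}{h_n^{1+\beta}k_n^\beta}\right) + O\left(h_n^{2\alpha/(2+\alpha)}\right)$.

(ii) Under (B): $\|f_n - f\|_C = O\left(\frac{1}{h_n^2 k_n^2}\right) + O(h_n^\alpha)$.

In either case, $f_n$ converges uniformly locally to $f$.

Applying Proposition 1 to the function $1_{[0,1]}$ leads to the following corollary, which will prove
useful in the sequel.

Corollary 2 Under the conditions of Proposition 1,

$$\lim_{n \to \infty} \left\| \frac{1}{k_n}\sum_{r=1}^{k_n} K_n(\cdot - x_r) - 1 \right\|_C = 0.$$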

4 Estimate convergences

This section is devoted to the study of the stochastic convergence of $\hat f_n$ to $f$. We establish
sufficient conditions for mean square local uniform convergence and almost complete local uniform
convergence.

4.1 Mean square local uniform convergence

In this paragraph, we give sufficient conditions for

$$\sup_{x \in C} \mathbb{E}\left[(\hat f_n(x) - f(x))^2\right] \to 0 \quad \text{as } n \to \infty,$$

where $C$ is a compact subset of $]0, 1[$. The well-known expansion

$$\mathbb{E}\left[(\hat f_n(x) - f(x))^2\right] = \left(\mathbb{E}(\hat f_n(x)) - f(x)\right)^2 + \mathrm{Var}(\hat f_n(x))$$

allows one to consider the bias term and the variance term separately. The two following lemmas
are devoted to the bias, which splits in turn as

$$\|\mathbb{E}(\hat f_n) - f\|_C \le \|\mathbb{E}(\hat f_n) - f_n\|_\infty + \|f_n - f\|_C.$$

Lemma 5 Suppose (C) is verified. Under (A) or (B), $\|\mathbb{E}(\hat f_n) - f_n\|_\infty = O(k_n/n)$.

As a consequence of Lemma 5 and Proposition 1 we obtain the behavior of the bias:
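Lemma 6 Suppose (C) is verified.

(i) Under (A):

$$\|\mathbb{E}(\hat f_n) - f\|_C = O\left(\frac{k_n}{n}\right) + O\left(\frac{1}{h_nk_n^\alpha}\right) + O\left(\frac{1}{h_n^{1+\beta}k_n^\beta}\right) + O\left(h_n^{2\alpha/(2+\alpha)}\right).$$

(ii) Under (B):

$$\|\mathbb{E}(\hat f_n) - f\|_C = O\left(\frac{k_n}{n}\right) + O\left(\frac{1}{h_n^2k_n^2}\right) + O(h_n^\alpha). \tag{7}$$

To conclude, it remains to consider the variance term.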
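Lemma 7 Suppose (C) is verified. Under (A) or (B),

$$\lim_{n \to \infty} \left\| \sigma_n^{-2}\,\mathrm{Var}(\hat f_n) - \sigma^2 \right\|_C = 0,$$

where $\sigma_n = \frac{k_n^{1/2}}{n h_n^{1/2}}$ and $\sigma = \frac{\|K\|_2}{c}$.

In situation (B), $\sigma_n = o(k_n/n)$, and therefore the variance of the estimator is small with respect
to the bias. In both situations, as a consequence of Lemma 6 and Lemma 7, we get: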

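Theorem 1 Suppose (C) is verified.

(i) Under (A):

$$\left\|\mathbb{E}(\hat f_n - f)^2\right\|_C = O\left(\frac{k_n^2}{n^2}\right) + O\left(\frac{1}{h_n^2k_n^{2\alpha}}\right) + O\left(\frac{1}{h_n^{2+2\beta}k_n^{2\beta}}\right) + O\left(h_n^{4\alpha/(2+\alpha)}\right).$$

(ii) Under (B):

$$\left\|\mathbb{E}(\hat f_n - f)^2\right\|_C = O\left(\frac{k_n^2}{n^2}\right) + O\left(\frac{1}{h_n^4k_n^4}\right) + O\left(h_n^{2\alpha}\right).$$

In either case, the mean square local uniform convergence of $\hat f_n$ to $f$ follows.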
In situation (B), choosing k„ — and /i„ ~ n'^^+^ yields 



E(/„ - f)' 



^ O 



and thus, we obtain the following bound for the Li norm: 



E ||/n-/|U - E / fnix)-f{x) 



dx < 



E(/„ - ff 



1/2 



O n 1+7' 



(8) 



As a comparison, the minimax rate in the $n$-sample case is $n^{-\alpha/(1+\alpha)}$ and is reached by Geffroy's
estimate. A bias reduction method will be introduced in Section 6 in order to improve the
bound (8).

4.2 Almost complete local uniform convergence

We shall give sufficient conditions for the convergence of the series

$$\forall \varepsilon > 0, \quad \sum_{n=1}^{\infty} P\left(\|\hat f_n - f\|_C > \varepsilon\right) < +\infty.$$

Theorem 2 Suppose $k_n = o(n/\ln n)$. Under (A) or (B), $\hat f_n$ is almost completely locally uniformly
convergent to $f$.

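Proof : Let $C$ be a compact subset of $]0, 1[$ and $\varepsilon > 0$. From Proposition 1, $f_n$ converges
uniformly to $f$ on $C$. It remains to consider $\|\hat f_n - f_n\|_C$. For $x \in C$, we have:

$$|\hat f_n(x) - f_n(x)| \le \frac{1}{k_n}\sum_{r=1}^{k_n}K_n(x - x_r)\max_r|X^*_{n,r} - f(x_r)|$$
$$\le \left(\left\|\frac{1}{k_n}\sum_{r=1}^{k_n}K_n(\cdot - x_r) - 1\right\|_C + 1\right)\max_r|X^*_{n,r} - f(x_r)|$$
$$\le (1 + o(1))\max_r|X^*_{n,r} - f(x_r)|,$$

with Corollary 2. Now, since $f$ is continuous on $[0, 1]$, $M_{n,r} - m_{n,r} < \varepsilon/2$ uniformly in $r$, for $n$
large enough, and therefore

$$\left\{\max_r|X^*_{n,r} - f(x_r)| > \varepsilon\right\} \subset \bigcup_r\left\{f(x_r) - X^*_{n,r} > \varepsilon\right\} \subset \bigcup_r\left\{X^*_{n,r} < m_{n,r} - \varepsilon/2\right\}.$$

As a consequence,

$$P\left(\max_r|X^*_{n,r} - f(x_r)| > \varepsilon\right) \le \sum_{r=1}^{k_n}F_{n,r}(m_{n,r} - \varepsilon/2),$$

where $F_{n,r}$ is given by (5). Then, the inequality

$$P\left(\max_r|X^*_{n,r} - f(x_r)| > \varepsilon\right) \le k_n\exp\left(-\frac{nc\varepsilon}{2k_n}\right)$$

entails the convergence of the series with $k_n = o(n/\ln n)$.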

5 Asymptotic distributions

In Theorem 3, we give the limiting distribution of the random variable $\hat f_n(x)$ for a fixed $x \in C$, a
compact subset of $]0, 1[$. In Theorem 4, we study the asymptotic distribution of the random vector
obtained by evaluating $\hat f_n$ at several distinct points of $C$.

Theorem 3 Suppose (C) is verified. Under (A) or (B), $s_n(x) = \sigma_n^{-1}(\hat f_n(x) - \mathbb{E}(\hat f_n(x)))$ converges
in distribution to a centered Gaussian variable with variance $\sigma^2$, for all $x \in C$.

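Proof : Let $x \in C$ be fixed. Introducing the $k_n$ independent random variables

$$Y_{n,r}(x) = \frac{n}{k_n^{3/2}h_n^{1/2}} K\left(\frac{x - x_r}{h_n}\right)\left(X^*_{n,r} - \mathbb{E}(X^*_{n,r})\right),$$

the quantity $s_n(x)$ can be rewritten as

$$s_n(x) = \sum_{r=1}^{k_n} Y_{n,r}(x).$$

Our goal is to prove that the Lyapounov condition

$$\lim_{n\to\infty} \frac{\sum_{r=1}^{k_n}\mathbb{E}(|Y_{n,r}(x)|^3)}{\mathrm{Var}^{3/2}(s_n(x))} = 0$$

holds under conditions (A), (C) or (B), (C). Remark first that

$$\mathrm{Var}(s_n(x)) = \mathrm{Var}(\hat f_n(x))/\sigma_n^2 \to \sigma^2$$

as $n \to \infty$ with Lemma 7. Second, we have

$$\sum_{r=1}^{k_n}\mathbb{E}(|Y_{n,r}(x)|^3) \le O\left(\frac{1}{k_n^{1/2}h_n^{1/2}}\right)\frac{1}{k_nh_n}\sum_{r=1}^{k_n}K^3\left(\frac{x - x_r}{h_n}\right),$$

with Lemma 1 (iii). Then,

$$\frac{1}{k_nh_n}\sum_{r=1}^{k_n}K^3\left(\frac{x-x_r}{h_n}\right) \le \int_{\mathbb{R}}K^3(u)\,du + o(1),$$

with Corollary 2 applied to the kernel $K^3/\int K^3(u)\,du$. As a conclusion,

$$\frac{\sum_{r=1}^{k_n}\mathbb{E}(|Y_{n,r}(x)|^3)}{\mathrm{Var}^{3/2}(s_n(x))} = O\left(\frac{1}{k_n^{1/2}h_n^{1/2}}\right),$$

and the result follows.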



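Theorem 4 Let $(y_1, \dots, y_q)$ be distinct points in $C$ and denote $I_q$ the identity matrix of size $q$.
Under the conditions of Theorem 3, the random vector $(s_n(y_j),\ j = 1, \dots, q)$ converges in
distribution to a centered Gaussian vector of $\mathbb{R}^q$ with covariance matrix $\sigma^2I_q$.

Proof : Our goal is to prove that, $\forall(u_1, \dots, u_q) \in \mathbb{R}^q$, the random variable

$$S_n = \sum_{i=1}^{q}u_is_n(y_i)$$

converges in distribution to a centered Gaussian variable with variance $\|u\|_2^2\sigma^2$. A straightforward
calculation yields

$$S_n = \sum_{r=1}^{k_n}Y_{n,r},$$

where we have defined

$$Y_{n,r} = \frac{n}{k_n^{3/2}h_n^{1/2}}\sum_{i=1}^{q}u_iK\left(\frac{y_i - x_r}{h_n}\right)\left(X^*_{n,r} - \mathbb{E}(X^*_{n,r})\right).$$

We use a chain of arguments similar to the ones in the proof of Theorem 3. First, the variance of
$Y_{n,r}$ is evaluated with Lemma 1 (ii):

$$\mathrm{Var}(Y_{n,r}) = \frac{n^2}{k_n^3h_n}\left(\sum_{i=1}^{q}u_iK\left(\frac{y_i - x_r}{h_n}\right)\right)^2\mathrm{Var}(X^*_{n,r}) = \frac{1}{k_nh_nc^2}\left(\sum_{i=1}^{q}u_iK\left(\frac{y_i - x_r}{h_n}\right)\right)^2\left(1 + O\left(\frac{n^2}{k_n^{2\alpha+2}}\right)\right).$$

Then, the variance of $S_n$ can be expanded as

$$\mathrm{Var}(S_n) \sim \frac{1}{c^2}\sum_{i=1}^{q}u_i^2\,\frac{1}{k_nh_n}\sum_{r=1}^{k_n}K^2\left(\frac{y_i - x_r}{h_n}\right) + \frac{1}{c^2}\sum_{i \ne j}u_iu_j\,\frac{1}{k_nh_n}\sum_{r=1}^{k_n}K\left(\frac{y_i - x_r}{h_n}\right)K\left(\frac{y_j - x_r}{h_n}\right).$$

Corollary 2 provides the limit of the first term and, from Corollary 1, the second term goes to 0
when $n$ goes to infinity. As a partial conclusion, $\mathrm{Var}(S_n) \to \|u\|_2^2\sigma^2$ when $n \to \infty$. Now, we have

$$\mathbb{E}(|Y_{n,r}|^3) = \frac{n^3}{k_n^{9/2}h_n^{3/2}}\left|\sum_{i=1}^{q}u_iK\left(\frac{y_i - x_r}{h_n}\right)\right|^3\mathbb{E}\left(|X^*_{n,r} - \mathbb{E}(X^*_{n,r})|^3\right).$$

Then, Lemma 1 (iii) entails

$$\sum_{r=1}^{k_n}\mathbb{E}(|Y_{n,r}|^3) = O\left(\frac{1}{k_n^{3/2}h_n^{3/2}}\right)\sum_{r=1}^{k_n}\left|\sum_{i=1}^{q}u_iK\left(\frac{y_i - x_r}{h_n}\right)\right|^3,$$

and remarking that

$$\left|\sum_{i=1}^{q}u_iK\left(\frac{y_i - x_r}{h_n}\right)\right|^3 \le \|K\|_\infty^2\|u\|_1^2\sum_{i=1}^{q}|u_i|K\left(\frac{y_i - x_r}{h_n}\right)$$

shows finally that

$$\sum_{r=1}^{k_n}\mathbb{E}(|Y_{n,r}|^3) = O\left(\frac{1}{k_n^{1/2}h_n^{1/2}}\right),$$

and the conclusion follows.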



6 Bias reduction

It is worth noticing that, in Section 5, the negative bias of $\hat f_n$ is too large to obtain a limit
distribution for $(\hat f_n - f)$. We introduce a corrected estimator $\tilde f_n$ sharp enough to obtain a limiting
distribution for $(\tilde f_n - f)$ under conditions (B) and (C).

It is clear, in view of Lemma 1, that $X^*_{n,r}$ is an estimator of $k_n\lambda_{n,r}$ with a negative bias
asymptotically equivalent to $-k_n/(nc)$. To reduce this bias, we introduce the random variable defined
by

$$Z_n = \frac{1}{n}\sum_{r=1}^{k_n} X^*_{n,r}.$$

Lemma 1 implies that, under (C),

$$\mathbb{E}(Z_n) = \frac{k_n}{nc} + O\left(\frac{k_n^2}{n^2}\right). \tag{9}$$

This suggests considering the estimator

$$\tilde f_n(x) = \frac{1}{k_n}\sum_{r=1}^{k_n} K_n(x - x_r)(X^*_{n,r} + Z_n), \quad x \in [0, 1].$$
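
The bias correction amounts to shifting every cell maximum by the same data-driven quantity before smoothing. A minimal Python sketch follows, assuming that the cell maxima have already been computed, that the intensity parameter $n$ is known, and that a biweight kernel is used; the names are hypothetical and this is not presented as the authors' code.

```python
import numpy as np

def biweight_kernel(t):
    """Compactly supported kernel (15/16)(1 - t^2)^2 on [-1, 1]."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 15.0 / 16.0 * (1.0 - t**2) ** 2, 0.0)

def bias_corrected_estimate(x, maxima, n, h, kernel=biweight_kernel):
    """Evaluate (1/k) * sum_r K_h(x - x_r) * (X*_r + Z_n),
    where Z_n = (1/n) * sum_r X*_r estimates the bias k / (n c).

    maxima : array of the k cell maxima X*_r
    n      : intensity parameter of the process (assumed known)
    h      : bandwidth
    """
    maxima = np.asarray(maxima, dtype=float)
    k = maxima.size
    centers = (np.arange(k) + 0.5) / k          # cell centers x_r
    z_n = maxima.sum() / n                      # correction term Z_n
    x = np.atleast_1d(np.asarray(x, dtype=float))
    weights = kernel((x[:, None] - centers[None, :]) / h) / h
    return weights @ (maxima + z_n) / k
```

Because $Z_n$ only uses the cell maxima, the correction does not require knowledge of the normalization constant $c$.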

A more precise version of Theorem 3 can be given in situation (B) at the expense of additional
conditions. To this end, we need a preliminary lemma providing the bias of the new estimator $\tilde f_n$.



Lemma 8 Under i^), (C); 



Kin) - f 



= o 



0{hZ). 



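The bias of the new estimator $\tilde f_n(x)$ is asymptotically lower than the bias of $\hat f_n(x)$ since the
$k_n/n$ term of (7) is cancelled in Lemma 8. Let us also note that the variance of $\tilde f_n(x)$ is bounded
above, up to a constant factor, by the variance of $\hat f_n(x)$. Since

$$\mathrm{Var}\,\tilde f_n(x) \le 2\,\mathrm{Var}\,\hat f_n(x) + 2\left(\frac{1}{k_n}\sum_{r=1}^{k_n}K_n(x - x_r)\right)^2 \mathrm{Var}\,Z_n, \tag{10}$$

it follows from Lemma 1, Lemma 7 and Corollary 2 that

$$\frac{\mathrm{Var}\,\tilde f_n(x)}{\mathrm{Var}\,\hat f_n(x)} \le 2 + O\left(\frac{\mathrm{Var}\,Z_n}{\sigma_n^2}\right) = 2 + O\left(\frac{k_n \max_r \mathrm{Var}\,X^*_{n,r}}{n^2\sigma_n^2}\right) = 2 + O\left(\frac{h_nk_n^2}{n^2}\right) = 2 + o(1). \tag{11}$$

These remarks allow us to give the asymptotic distribution of $(\tilde f_n(x) - f(x))$.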

Theorem 5 If (B) holds, $n = o\left(k_n^{1/2}h_n^{-1/2-\alpha}\right)$, $n = o\left(k_n^{5/2}h_n^{3/2}\right)$ and $k_n = o(n/\ln n)$, then
$t_n(x) = \sigma_n^{-1}(\tilde f_n(x) - f(x))$ converges in distribution to a centered Gaussian variable with variance
$\sigma^2$, for all $x$ in a compact subset of $]0, 1[$.

Proof : Consider the expansion

$$t_n(x) = \sigma_n^{-1}(\tilde f_n(x) - \mathbb{E}(\tilde f_n(x))) + \sigma_n^{-1}(\mathbb{E}(\tilde f_n(x)) - f(x))$$
$$= \sigma_n^{-1}(\hat f_n(x) - \mathbb{E}(\hat f_n(x))) + \frac{\sigma_n^{-1}}{k_n}\sum_{r=1}^{k_n}K_n(x - x_r)(Z_n - \mathbb{E}(Z_n)) + \sigma_n^{-1}(\mathbb{E}(\tilde f_n(x)) - f(x)).$$

In view of Theorem 3, the first term converges in distribution to a centered Gaussian variable
with variance $\sigma^2$. The second term is centered and its variance converges to zero from (10)
and (11). Therefore, the second term converges to 0 in probability. The third term is controlled
with Lemma 8:



E(/„) - / 



c 



= O 



^3/2,5/2 



o 



1,1/2 



0, 



and the conclusion follows. The uniform mean square distance between $\tilde f_n$ and $f$ is derived from
Lemma 7 and Lemma 8:



E(/„ - ff 



O 



O 



1 



/?4 i.4 



o [hi-) 



(12) 



Possible choices are A:„ = n-i+sa and hn — n -i+sq leading to 

c 



E(/„ - fY 



= O (n 



and thus, 



E ||/„-/|li =0 n ^ 



(13) 



which is a significant improvement of (8). It is well-known that non-parametric estimators based
on Parzen-Rosenblatt kernels suffer from a lack of performance on the boundaries of the estimation
interval. To overcome this limitation, symmetrization techniques have been developed [4]. The
application of such a method to $\tilde f_n(x)$ yields the following estimator:

$$\check f_n(x) = \frac{1}{k_n}\sum_{r=1}^{k_n}\left(K_n(x - x_r) + K_n(x + x_r) + K_n(x + x_r - 2)\right)(X^*_{n,r} + Z_n), \quad x \in [0, 1].$$

The convergence properties of $\hat f_n$ and $\tilde f_n$ on the compact subsets of $]0, 1[$ can be extended to
$\check f_n$ on the whole interval $[0, 1]$ without difficulties.



7 Comparison with other estimates

Let us emphasize that such comparisons are only relevant within the same framework, which
excludes hypotheses such as the convexity or the monotonicity of $f$. Thus, the methods competing
with our kernel approach are essentially local polynomial estimates [13], [12], piecewise polynomial
estimates [inilin] and our projection estimate [51 [TU].

- From the theoretical point of view, piecewise polynomial estimates benefit from minimax
optimality whereas the estimates proposed in this paper are suboptimal. In the class of continuous
functions $f$ having a Lipschitzian $k$-th derivative, the optimal rate of convergence is attained by
minimizing, on each cell of a partition of $[0, 1]$, the measure of a domain with a polynomial edge of
degree $k$. For instance, in the case of an $\alpha$-Lipschitzian frontier, the minimax optimal rate for the
$L_1$ norm is $n^{-\alpha/(1+\alpha)}$ and the corresponding rate is $n^{-\alpha/(5/4+\alpha)}$ for $\tilde f_n$ (see (13)). The difference
of speed increases with $\alpha$, but even if $\alpha = 1$ (which is the worst situation for us), one obtains
"similar" rates of convergence, that is $n^{-4/9}$ and $n^{-1/2}$. In this sense, kernel estimates bring a
significant improvement to projection estimates.

- From the practical point of view, all the previous estimates require the selection of two
hyper-parameters. In the case of piecewise polynomial and local polynomial estimators, the
construction of the estimate requires selecting the degree of the polynomial function (which
corresponds to $k$ in the piecewise polynomial framework) and a smoothing parameter (the size of the
cells in the piecewise polynomial context and the size of the moving window in the local polynomial
context). Of course, the selection of the degree of the polynomial function is usually easier than the
choice of a parameter on a continuous scale such as $h_n$. Nevertheless, our opinion is that kernel
estimates are the most pleasant to use in practice for the following reasons. The computation of local
and piecewise polynomial estimates requires solving an optimization problem. For instance, the
computation of piecewise polynomial estimates is not straightforward, at least for $k > 0$. When
$k = 0$, piecewise polynomial estimates reduce to Geffroy's estimate, whose unsatisfying behavior in
finite sample situations has been illustrated in Section ??. In contrast, kernel and projection
estimators enjoy explicit forms and are thus easily implementable. Besides, these methods yield
smooth estimates whereas piecewise polynomial estimates are discontinuous whatever the regularity
degree of $f$ is. Finally, only kernel and projection estimates benefit from an explicit asymptotic
distribution. This property allows one to build pointwise confidence intervals without costly
Monte-Carlo methods. In the local polynomial estimates situation (see fT^), both the limiting
distribution and the normalization sequences are not explicit, making the reduction of the
asymptotic bias difficult.



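8 Appendix : proof of lemmas

Proof of Lemma 1 We give here the complete proof of (i) and a sketch of the proofs of (ii) and
(iii) since the methods in use are similar.

(i) The mathematical expectation can be expanded in three terms:

$$\mathbb{E}(X^*_{n,r} - k_n\lambda_{n,r}) = -k_n\lambda_{n,r}F_{n,r}(0) + \int_0^{m_{n,r}}(x - k_n\lambda_{n,r})F'_{n,r}(x)\,dx + \int_{m_{n,r}}^{M_{n,r}}(x - k_n\lambda_{n,r})\,F_{n,r}(dx).$$

The first term of the sum is asymptotically negligible:

$$\max_r k_n\lambda_{n,r}e^{-nc\lambda_{n,r}} = o(n^{-s}), \tag{14}$$

for all $s > 0$, when $n \to \infty$. Using (5), the second term can be rewritten as

$$\int_0^{m_{n,r}}(x - k_n\lambda_{n,r})F'_{n,r}(x)\,dx = \frac{nc}{k_n}\int_0^{m_{n,r}}(x - k_n\lambda_{n,r})\exp\left(\frac{nc}{k_n}(x - k_n\lambda_{n,r})\right)dx = -\frac{k_n}{nc}\int_{\frac{nc}{k_n}(k_n\lambda_{n,r} - m_{n,r})}^{nc\lambda_{n,r}} u\exp(-u)\,du.$$

Let us note $\varphi(u) = (u + 1)\exp(-u)$ a primitive of $-u\exp(-u)$. We have $\varphi(u) = 1 + O(u^2)$
when $u \to 0$ and $\varphi(u) = o(u^{-s})$, $\forall s > 0$ when $u \to \infty$. Consequently, remarking that the
upper bound goes to infinity, and that the lower bound goes to 0 under the assumption
$n = o(k_n^{1+\alpha})$ yields

$$\max_r\left|\int_0^{m_{n,r}}(x - k_n\lambda_{n,r})F'_{n,r}(x)\,dx + \frac{k_n}{nc}\right| = O\left(\frac{n}{k_n^{1+2\alpha}}\right). \tag{15}$$

The third term is bounded above by

$$\int_{m_{n,r}}^{M_{n,r}}(x - k_n\lambda_{n,r})\,F_{n,r}(dx) \le (M_{n,r} - m_{n,r})\left[1 - \exp\left(\frac{nc}{k_n}(m_{n,r} - k_n\lambda_{n,r})\right)\right],$$

and thus

$$\max_r\left|\int_{m_{n,r}}^{M_{n,r}}(x - k_n\lambda_{n,r})\,F_{n,r}(dx)\right| = O\left(\frac{n}{k_n^{1+2\alpha}}\right). \tag{16}$$

Collecting (14), (15) and (16) proves the result.

(ii) It is convenient to write the variance as

$$\mathrm{Var}(X^*_{n,r}) = \mathrm{Var}(X^*_{n,r} - k_n\lambda_{n,r}) = \mathbb{E}\left[(X^*_{n,r} - k_n\lambda_{n,r})^2\right] - \mathbb{E}^2(X^*_{n,r} - k_n\lambda_{n,r}).$$

With a method very similar to the one used to prove (i), we obtain uniformly in $r$,

$$\mathbb{E}\left[(X^*_{n,r} - k_n\lambda_{n,r})^2\right] = \frac{2k_n^2}{n^2c^2} + O\left(\frac{1}{k_n^{2\alpha}}\right).$$

Besides, (i) entails

$$\mathbb{E}^2(X^*_{n,r} - k_n\lambda_{n,r}) = \frac{k_n^2}{n^2c^2} + O\left(\frac{1}{k_n^{2\alpha}}\right),$$

uniformly in $r$, and the conclusion follows.

(iii) The proof is similar. It requires the calculation of $\mathbb{E}(|X^*_{n,r} - k_n\lambda_{n,r}|^3)$ and the use of (i) and
(ii).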

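Proof of Lemma 2 Let $\varepsilon > 0$ and split (6) into

$$\int_{\mathbb{R}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) = \int_{|u| > \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) + \int_{|u| \le \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du)$$

and consider the two terms separately.

• The first term is bounded above by

$$\int_{|u| > \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) \le \|K\|_\infty\int_{|u| > \frac{|a|}{2h_n}}K(u)P_n(du) \tag{17}$$
$$= \|K\|_\infty\int_{|u| > \frac{|a|}{2h_n}}|u|K(u)\frac{1}{|u|}P_n(du).$$

Since $uK(u) \to 0$ when $|u| \to \infty$, for $n$ large enough $|uK(u)| < \varepsilon$ on the domain of integration,
entailing

$$\int_{|u| > \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) \le \|K\|_\infty\,\varepsilon\int_{|u| > \frac{|a|}{2h_n}}\frac{1}{|u|}P_n(du) \le \frac{2\varepsilon h_n\|K\|_\infty}{|a|}.$$

We have proved that $\forall\varepsilon > 0$, for $n$ large enough

$$\frac{1}{h_n}\int_{|u| > \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) \le \frac{2\varepsilon\|K\|_\infty}{|a|},$$

or equivalently,

$$\int_{|u| > \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) = o(h_n).$$

• The second term is bounded above by

$$\int_{|u| \le \frac{|a|}{2h_n}}K(u)K\left(u + \frac{a}{h_n}\right)P_n(du) \le \|K\|_\infty\int_{|u| \le \frac{|a|}{2h_n}}K\left(u + \frac{a}{h_n}\right)P_n(du),$$

and, with $v = u + a/h_n$, the condition $|u| \le |a|/(2h_n)$ entails $|v| \ge |a|/(2h_n)$, so the end of the
proof is the same as for (17).

Proof of Lemma 3 Taking into account that $f$ vanishes out of $[0, 1]$, we have

$$f_n(x) - g_n(x) = \frac{1}{k_n}\sum_{r=1}^{k_n}K_n(x - x_r)f(x_r) - \int_0^1 K_n(x - y)f(y)\,dy.$$

Let us define $\phi_{n,x}(y) = K_n(x - y)f(y)$ for $(x, y) \in \mathbb{R}^2$. With this notation,

$$f_n(x) - g_n(x) = \sum_{r=1}^{k_n}\int_{I_{n,r}}[\phi_{n,x}(x_r) - \phi_{n,x}(y)]\,dy = \frac{1}{k_n}\sum_{r=1}^{k_n}\int_{-1/2}^{1/2}\left[\phi_{n,x}(x_r) - \phi_{n,x}\left(x_r + \frac{u}{k_n}\right)\right]du, \tag{18}$$

and we have the following expansion:

$$\phi_{n,x}(x_r) - \phi_{n,x}\left(x_r + \frac{u}{k_n}\right) = K_n\left(x - x_r - \frac{u}{k_n}\right)\left(f(x_r) - f\left(x_r + \frac{u}{k_n}\right)\right) \tag{19}$$
$$+ f(x_r)\left(K_n(x - x_r) - K_n\left(x - x_r - \frac{u}{k_n}\right)\right). \tag{20}$$

Now, since $f$ is $\alpha$-Lipschitz, (19) is uniformly bounded above by $\|K\|_\infty L_f/(h_nk_n^\alpha)$. The rest of the
proof depends on the assumptions made on $K$:

(i) Under (A), $K$ is $\beta$-Lipschitzian and thus (20) is uniformly bounded above by
$\|f\|_\infty L_K/(h_n^{1+\beta}k_n^\beta)$, and the conclusion follows.

(ii) Under (B), since $K$ has a compact support, the number of nonzero terms in (18) is $O(k_nh_n)$.
Thus, the contribution of (19) is $O(1/k_n^\alpha)$. Two situations have to be considered for the
term (20). If $r$ is such that $K$ has only a bounded first derivative at $x - x_r$, then (20) is
uniformly bounded above by $\|f\|_\infty\|K'\|_\infty/(h_n^2k_n)$. Remarking that there are only a finite number
of such terms in (18) shows that their contribution to (20) is $O(1/(h_n^2k_n^2))$. If $r$ is such that
$K$ is $C^2$ at $x - x_r$, then a second order Taylor expansion yields

$$K_n(x - x_r) - K_n\left(x - x_r - \frac{u}{k_n}\right) = \frac{u}{k_n}K_n'(x - x_r) - \frac{u^2}{2k_n^2}K_n''\left(x - x_r - \theta_n(u)\frac{u}{k_n}\right),$$

with $\theta_n(u) \in ]0, 1[$. Replacing in (18), the first order term vanishes, and thus the contribution
of (20) is bounded above, up to a constant, by $\|f\|_\infty\|K_n''\|_\infty h_n/(24k_n^2)$. Since
$\|K_n''\|_\infty = O(1/h_n^3)$ the result follows.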

Proof of Lemma 4 For any compact subset $C \subset ]0, 1[$, there exist $0 < a < b < 1$ such that
$C \subset [a, b]$. Let $x \in [a, b]$ and consider

$$g_n(x) - f(x) = \int_{\mathbb{R}}K_n(u)(f(x - u) - f(x))\,du. \tag{21}$$

Let $(\delta_n)$ be a positive sequence tending to 0. Then, (21) is bounded above by

$$|g_n(x) - f(x)| \le \sup_{|u| \le \delta_n}|f(x - u) - f(x)| + \int_{|u| > \delta_n}K_n(u)|f(x - u) - f(x)|\,du$$
$$\le \sup_{|u| \le \delta_n}|f(x - u) - f(x)| + 2\|f\|_\infty\int_{|u| > \delta_n}K_n(u)\,du.$$

For $n$ large enough, $\delta_n < \min(a, 1 - b)$ and then $|u| \le \delta_n$ entails $(x - u) \in [0, 1]$. Now, since $f$ is
$\alpha$-Lipschitzian on $[0, 1]$, it yields

$$|g_n(x) - f(x)| \le L_f\delta_n^\alpha + 2\|f\|_\infty\int_{|u| > \delta_n}K_n(u)\,du. \tag{22}$$

Two cases arise:

(i) If $u \to u^2K(u)$ is integrable then

$$|g_n(x) - f(x)| \le L_f\delta_n^\alpha + 2\|f\|_\infty\frac{h_n^2}{\delta_n^2}\int_{\mathbb{R}}u^2K(u)\,du.$$

Considering $\delta_n = h_n^{2/(2+\alpha)}$ in this inequality (which can also be found page 61 in [3] under
different hypotheses) gives the result.

(ii) If $K$ has a compact support, let $A > 0$ such that $\mathrm{supp}(K) \subset [-A, A]$. Then, considering
$\delta_n = Ah_n$, the second term in (22) vanishes and the result is proved.



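Proof of Lemma 5 Consider $x \in C$. As a consequence of the definitions,

$$|\mathbb{E}(\hat f_n(x)) - f_n(x)| \le \frac{1}{k_n}\sum_{r=1}^{k_n}K_n(x - x_r)\max_r|\mathbb{E}(X^*_{n,r}) - f(x_r)|$$
$$\le \left(\left\|\frac{1}{k_n}\sum_{r=1}^{k_n}K_n(\cdot - x_r) - 1\right\|_C + 1\right)\max_r|\mathbb{E}(X^*_{n,r}) - f(x_r)|$$
$$\le (1 + o(1))\max_r|\mathbb{E}(X^*_{n,r}) - f(x_r)|,$$

with Corollary 2. Besides, we have

$$|\mathbb{E}(X^*_{n,r}) - f(x_r)| \le \frac{k_n}{nc} + \left|\mathbb{E}(X^*_{n,r}) - k_n\lambda_{n,r} + \frac{k_n}{nc}\right| + |k_n\lambda_{n,r} - f(x_r)|,$$

and Lemma 1 yields

$$\|\mathbb{E}(\hat f_n) - f_n\|_\infty = O\left(\frac{k_n}{n}\right) + O\left(\frac{n}{k_n^{1+2\alpha}}\right) + O\left(\frac{1}{k_n^\alpha}\right) = O\left(\frac{k_n}{n}\right)$$

under (C).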



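Proof of Lemma 7 Let $x \in C$. In view of the independence of the $X^*_{n,r}$, $r = 1, \dots, k_n$,

$$\mathrm{Var}(\hat f_n(x)) = \frac{1}{k_n^2}\sum_{r=1}^{k_n}K_n^2(x - x_r)\,\mathrm{Var}(X^*_{n,r}).$$

Introducing

$$\Delta V_n = \max_r\left|\frac{n^2}{k_n^2}\mathrm{Var}(X^*_{n,r}) - \frac{1}{c^2}\right| \quad \text{and} \quad \Delta K_n = \left\|\frac{1}{k_nh_n}\sum_{r=1}^{k_n}K^2\left(\frac{\cdot - x_r}{h_n}\right) - \|K\|_2^2\right\|_C,$$

we have

$$\left\|\sigma_n^{-2}\,\mathrm{Var}(\hat f_n) - \sigma^2\right\|_C \le \Delta K_n(\Delta V_n + 1/c^2) + \Delta V_n\|K\|_2^2.$$

Lemma 1 shows that $\Delta V_n \to 0$, Corollary 2 applied to the kernel $K^2/\|K\|_2^2$ shows that
$\Delta K_n \to 0$ as $n \to \infty$, and the conclusion follows.

Proof of Lemma 8 The bias expands as

$$\|\mathbb{E}(\tilde f_n) - f\|_C \le \|f_n - f\|_C + \|\mathbb{E}(\tilde f_n) - f_n\|_\infty,$$

whose first term is controlled by Proposition 1. Consider the second term:

$$|\mathbb{E}(\tilde f_n(x)) - f_n(x)| \le \left(\frac{1}{k_n}\sum_{r=1}^{k_n}K_n(x - x_r)\right)\max_r|\mathbb{E}(X^*_{n,r}) + \mathbb{E}(Z_n) - f(x_r)|$$
$$\le \left(\left\|\frac{1}{k_n}\sum_{r=1}^{k_n}K_n(\cdot - x_r) - 1\right\|_C + 1\right)\max_r|\mathbb{E}(X^*_{n,r}) + \mathbb{E}(Z_n) - f(x_r)|. \tag{23}$$

Corollary 2 shows that it is sufficient to consider

$$|\mathbb{E}(X^*_{n,r}) + \mathbb{E}(Z_n) - f(x_r)| \le \left|\mathbb{E}(X^*_{n,r}) - k_n\lambda_{n,r} + \frac{k_n}{nc}\right| + \left|\mathbb{E}(Z_n) - \frac{k_n}{nc}\right| + |k_n\lambda_{n,r} - f(x_r)|.$$

Lemma 1 and (9) yield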



E(/„) - /. 



c 



O 



o 



under (C), and the conclusion follows. 